Recursive least square perceptron model for non-stationary and imbalanced data stream classification
Abstract
Classifying non-stationary and imbalanced data streams involves two important challenges: concept drift and class imbalance. Concept drift (or non-stationarity) refers to changes in the underlying function being learned, and class imbalance is a large difference between the numbers of instances in the different classes of the data. Class imbalance reduces the effectiveness of most classifiers and is usually observed in two-class datasets. Previous methods for classifying non-stationary and imbalanced data streams mainly focus on batch solutions, in which the classification model is trained on a chunk of data. Here, we propose an online perceptron model. The main contribution is a new error model inspired by the error model of the recursive least squares (RLS) filter. In the proposed error model, non-stationarity is handled by the forgetting factor of the RLS error model, and two different error weighting strategies are proposed for handling class imbalance. These strategies are verified using convergence and tracking results from adaptive filter theory. The proposed methods are evaluated on two synthetic and six real-world two-class datasets and compared with seven previous online perceptron models. The results show a statistically significant improvement over previous methods.
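The idea in the abstract can be illustrated with a minimal sketch of an online linear classifier trained by a weighted RLS update. This is not the paper's method: the abstract does not state the two error weighting strategies, so the weighting used here (inverse of an exponentially decayed class frequency) is an assumption, and the class name `WeightedRLSPerceptron` and the parameters `forgetting` and `delta` are hypothetical. The update minimises the sum over past samples of forgetting^(n-i) * c_i * (y_i - w·x_i)^2, where c_i is the class weight of sample i.

```python
class WeightedRLSPerceptron:
    """Online linear classifier with a weighted RLS update (illustrative sketch)."""

    def __init__(self, n_features, forgetting=0.99, delta=100.0):
        self.lam = forgetting              # forgetting factor, 0 < lam <= 1
        d = n_features + 1                 # +1 for the bias term
        self.w = [0.0] * d
        # P approximates the inverse of the weighted input correlation matrix.
        self.P = [[delta if i == j else 0.0 for j in range(d)] for i in range(d)]
        # Decayed class counts (assumed weighting scheme, not from the paper).
        self.counts = {1: 1.0, -1: 1.0}

    @staticmethod
    def _augment(x):
        return list(x) + [1.0]             # append constant input for the bias

    def decision(self, x):
        z = self._augment(x)
        return sum(wi * zi for wi, zi in zip(self.w, z))

    def predict(self, x):
        return 1 if self.decision(x) >= 0.0 else -1

    def update(self, x, y):
        """Process one labelled sample; y must be +1 or -1. Returns the prior error."""
        # Track class frequencies with the same forgetting factor.
        for label in self.counts:
            self.counts[label] *= self.lam
        self.counts[y] += 1.0
        total = self.counts[1] + self.counts[-1]
        c = total / (2.0 * self.counts[y])  # rarer class gets a larger error weight

        z = self._augment(x)
        d = len(z)
        Pz = [sum(p * zj for p, zj in zip(row, z)) for row in self.P]
        denom = self.lam + c * sum(zi * pzi for zi, pzi in zip(z, Pz))
        gain = [c * pzi / denom for pzi in Pz]

        err = y - sum(wi * zi for wi, zi in zip(self.w, z))
        self.w = [wi + gi * err for wi, gi in zip(self.w, gain)]
        # P <- (P - gain * (P z)^T) / lam ; P is symmetric, so (P z)^T = z^T P.
        self.P = [[(self.P[i][j] - gain[i] * Pz[j]) / self.lam for j in range(d)]
                  for i in range(d)]
        return err


if __name__ == "__main__":
    model = WeightedRLSPerceptron(n_features=1, forgetting=0.98)
    # Toy imbalanced stream: only inputs above 1.0 belong to the positive class.
    for i in range(300):
        x = ((i * 37) % 21 - 10) / 5.0
        model.update([x], 1 if x > 1.0 else -1)
    print(model.predict([2.0]), model.predict([-2.0]))
```

A forgetting factor below 1 discounts old samples geometrically, which is what lets the model follow a drifting concept; setting it to 1 recovers ordinary (non-forgetting) weighted least squares. Each update costs O(d^2) for d inputs, so the sketch suits low-dimensional streams.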
Similar resources
Adaptive mobile localization using the residual test method
Determining mobile location from time-of-arrival (TOA) signals is a requirement in cellular mobile communication. In some previous methods, localization over non-line-of-sight (NLOS) paths can lead to large position errors. Also, for simplicity, most simulations treat actual non-stationary environments as stationary. This paper proposes (residual test + recursive least square)...
Cost-Sensitive Perceptron Decision Trees for Imbalanced Drifting Data Streams
Mining streaming and drifting data is among the most popular contemporary applications of machine learning methods. Due to the potentially unbounded number of instances arriving rapidly, evolving concepts and limitations imposed on utilized computational resources, there is a need to develop efficient and adaptive algorithms that can handle such problems. These learning difficulties can be furt...
Learning Framework for Non-stationary and Imbalanced Data Stream
Abstract: Although learning on non-stationary data and learning on imbalanced data have each been studied extensively in the literature, little work has been done to tackle the imbalance issue on non-stationary data streams, where the joint probability distribution between the data and classes changes with time and may result in a skewed class distribution. Especially in airlines delay detection, data ...
Analysis of the SNR Estimator for Speech Enhancement Using a Cascaded Linear Model
Speech enhancement is the removal of contaminating noise from a speech signal and the improvement of its overall quality. To gain the advantages of individual algorithms, we propose a new linear model in the form of cascaded adaptive filters for the suppression of non-stationary noise. We have successfully deployed NLMS (Normalized Least Mean Square) algorithm, Sign LMS (Least Mean Square) and RLS (Recursi...
A Recursive Exponential Filter For Time-Sensitive Data
A recursive formulation of an exponential smoothing filter is developed within the framework of a least-squares error approach with data uncertainties that increase exponentially with time. An efficient implementation in Java is presented. By analogy to the Kalman filter, an interpretation of the gain as a ratio of uncertainties leads to a measure of validity for the recursive exponential fil...
Journal title:
- Evolving Systems
Volume 4, issue
Pages -
Publication date: 2013